Quantum indirect estimation theory and joint estimate of all moments of two incompatible observables
We introduce the quantum indirect estimation theory, which provides a general framework to
address the problem of which ensemble averages can be estimated by means of an available set of
measuring apparatuses, e.g. estimating the ensemble average of an observable by measuring another
observable. A main ingredient in this approach is that of informationally complete (infocomplete
in short) measurements, which allow one to estimate the ensemble average of any arbitrary system
operator, as in quantum tomography. This naturally leads to the more stringent concept of
AB-informationally complete measurements, by which one can estimate jointly all the moments of two
incompatible observables A and B. After analyzing all general properties of such measurements,
we address the problem of their optimality, and we completely solve the case of qubits, showing
that a $\sigma_x\sigma_y$-infocomplete measurement is less noisy than any infocomplete one. We will also discuss
the relation between the concept of AB-infocompleteness and the notion of joint measurement of
observables A and B.
I. INTRODUCTION

The aim of any measurement is to retrieve information
on the state of a physical system. In classical mechanics,
measuring the location on the phase space provides
complete information on the system. On the
other hand, in quantum mechanics there are infinitely
many elementary measurements, corresponding to different
observables, that provide only partial information,
whereas "complementary" information would require
mutually exclusive experiments corresponding to
non-commuting observables.

The problem then arises of how to perform a quantum
measurement that can be used to infer information on
noncompatible observables. The idea is to make a generalized
"unsharp" measurement [1], described by a so-called
POVM (positive-operator valued measure), from
which a specific type of information, such as a particular
ensemble average of a given operator, is retrieved by
a suitable data-processing of its experimental outcomes.

Of special interest are the informationally complete
POVMs (infocomplete POVMs in short), which span
the whole operator space, thus allowing the estimation of
arbitrary ensemble averages. Informationally complete 
measurements are relevant for foundations of quantum 
mechanics as a kind of "standard" for a purely proba- 
bilistic description Q. Moreover, the existence of such 
measurements with minimal number of outcomes is crucial
for the quantum version of the de Finetti theorem.
The most popular example of informationally complete
measurement is given by the coherent-state POVM
for a single mode of the radiation field, whose probability
distribution is the so-called Q-function (or Husimi
function). Another example, though of completely
different kind, is the case of quantum tomography [fj, 
in which one measures an observable randomly selected 
from an informationally complete set — a "quorum". 

Investigations on informationally complete measure- 
ments have been extensively carried out. In the frame- 



work of "phase-space observables" @, H, H, OH 03 the 
concept of informational completeness leads to substan- 
tial advancement on some relevant conceptual issues, 
such as the problem of jointly measuring non-commuting 
observables, or the problem of the classical limit of quan- 
tum measurements. A general classification of covariant 
infocomplete measurements has been given using group- 
theoretical techniques [HI], whereas the classification of 
the symmetric ones is still an open problem [13]. A thorough
comparison of local with global infocomplete measurements
for bipartite quantum systems has been carried
out in Ref. [14]. On the other hand, for any general
infocomplete measurement the optimal data-processing
function for estimating the ensemble average of an arbitrary
operator has been derived [15] with the help of
frame theory [16, 17].

In this paper we introduce the quantum indirect es- 
timation theory, which provides the general framework 
to address the problem of which ensemble averages can 
be estimated by means of an available set of measuring 
apparatuses. Typically, one has the problem of estimat- 
ing the ensemble average of an observable by measuring 
other observables, or of estimating the expectation of a 
POVM — i. e. a probability distribution — by physically 
measuring another POVM. Essentially, one can estimate 
all expectations of operators that are linear combinations 
of POVM elements. The indirect estimation is achieved 
via a data processing of measurement outcomes. The data 
processing associates a numerical value to each outcome, 
depending on the ensemble average to be estimated. The 
final goal of the theory is then to optimize the data processing
(generally not unique) in order to maximize statistical
efficiency [15]. A special case of data-processing
is the post-processing, which corresponds to probabilistic
Boolean operations and permutations on the outcomes,
with the data-processing function corresponding to a conditional
probability. A typical example of post-processing
is the coarse-graining of a POVM, in which each outcome
is indeed a union of elementary outcomes, e.g. in the
marginalization of a bivariate POVM.
Clearly, a central role in quantum indirect estimation
theory is played by infocomplete POVMs, by which one
can estimate the ensemble average of any arbitrary operator.
However, for the estimation of the ensemble averages
$\langle A\rangle$ and $\langle B\rangle$ of two (noncommuting) operators one
does not necessarily need an infocomplete measurement,
even in the case when one wants to estimate the full
probability distributions of A and B. In the latter case one
just needs a particular measurement, which we will introduce
in the present paper, and which will be referred to
as an AB-infocomplete measurement. Indeed, the necessity
of estimating complementary observables is the reason
why the POVM which achieves the task is unsharp, and
hence it adds noise with respect to a POVM which estimates
a single observable. Likewise, one can infer that an
AB-infocomplete POVM which is not infocomplete should
add less noise than an infocomplete one, since the first
kind of measurement avoids collecting redundant information.
We will see that this is indeed true in the special
case of qubits. We will also see that generally a joint
measurement of observables A and B is not necessarily
an AB-infocomplete measurement, whereas, vice versa,
an AB-infocomplete measurement is an unbiased joint
measurement of A and B.

The paper is organized as follows. Sec. II is a long
section where we introduce the quantum indirect estimation
theory through the notion of partially informationally
complete POVM, where the linear span of the POVM
elements is a proper subspace of the Hilbert-Schmidt operator
space. We also briefly review the theory of frames
[16, 17], which generalize the concept of (operator) basis,
and show how to characterize and optimize the processing
functions of quantum measurements to estimate the
expectation of observables. The notions of data-processing
and post-processing are explained, and the concept of
joint measurement of observables is recalled. In Sec. III
minimal AB-infocomplete measurements are introduced,
as the measurements described by POVMs whose span
coincides with the span of A, B and all their independent
powers. A useful Lemma that gives sufficient conditions
for minimality of the optimal AB-infocomplete measurement
is proved. The case of qubits is solved in Sec. III A,
when the ensemble of unknown states corresponds to an
isotropic distribution. Sec. IV is devoted to the conclusions.



II. INDIRECT ESTIMATION THEORY 

A measurement on a quantum system Q returns a
random result e from a set of possible outcomes $E = \{e = 1, \ldots, N\}$,
with probability distribution $p(e|\rho)$ depending
on the state $\rho$ of the system in a way which is distinctive
of the measuring apparatus, according to the Born rule

$$p(e|\rho) = \mathrm{Tr}[\rho P_e]. \tag{1}$$

In Eq. (1), $P_e$ denote positive operators on the Hilbert
space $\mathsf{H}$ of the system, representing our knowledge of the
measuring apparatus from which we infer information on
the state $\rho$ from the probability distribution $p(e|\rho)$. Positivity
of $P_e$ is needed for positivity of $p(e|\rho)$, whereas
normalization is guaranteed by the completeness relation
$\sum_{e\in E} P_e = I$. In the present paper we will only consider
the simple case of a finite discrete set E. More generally,
one has an infinite probability space E (generally continuous),
and in this context the set of positive operators
$\{P_e\}$ becomes actually a positive operator valued measure
(POVM), but we will keep the same acronym also
for the discrete case, as usual in the literature. Every
apparatus is described by a POVM, and, conversely, every
POVM can be realized in principle by an apparatus.
Throughout this paper we will consider a quantum
system with Hilbert space $\mathsf{H}$ with finite dimension
$d = \dim(\mathsf{H}) < +\infty$.
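As a concrete illustration of the Born rule in Eq. (1), the following Python sketch evaluates the outcome probabilities of a qubit measurement. The three-outcome "trine" POVM, the input state and all function names are assumptions chosen for this example and are not taken from the paper.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def trine_povm():
    """Hypothetical example POVM:
    P_e = (1/3)(I + cos(phi_e) sigma_x + sin(phi_e) sigma_y), phi_e = 2*pi*e/3."""
    povm = []
    for e in range(3):
        phi = 2 * math.pi * e / 3
        c, s = math.cos(phi), math.sin(phi)
        povm.append([[1 / 3, (c - 1j * s) / 3],
                     [(c + 1j * s) / 3, 1 / 3]])
    return povm

def born_probabilities(rho, povm):
    """p(e|rho) = Tr[rho P_e] for every POVM element."""
    return [trace(matmul(rho, p)).real for p in povm]

# Assumed input state: the sigma_x eigenstate |+><+| = (I + sigma_x)/2.
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
probs = born_probabilities(rho_plus, trine_povm())
```

For this choice the probabilities are $(1+\cos\phi_e)/3$, and they sum to one because the three elements satisfy the completeness relation.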

In the following we define the data processing $c^X$ for a
POVM in order to reconstruct the ensemble average $\langle X\rangle$
of an operator $X \in \mathcal{L}(\mathsf{H})$ ($c^X : i \mapsto c_i^X$ is the so-called
processing function).



A. Informationally complete measurements 

We recall that the space of Hilbert-Schmidt operators
is isomorphic to $\mathsf{H}^{\otimes 2}$, and coincides with the space $\mathcal{L}(\mathsf{H})$
of linear operators on $\mathsf{H}$ for finite dimensional Hilbert
space $\mathsf{H} \simeq \mathbb{C}^d$.

A POVM P is called informationally complete @ if it 
linearly spans the whole operator space $\mathcal{L}(\mathsf{H})$. We generalize
this concept to the following notion of partially
informationally complete POVM

Definition 1 For $\mathcal{R}$ a linear operator space, we will call
a POVM $\mathcal{R}$-informationally complete if $\mathcal{R} \subseteq \mathrm{Span}(\mathbf{P})$.

We have used the natural notation $\mathrm{Span}(\mathbf{P}) =
\mathrm{Span}(P_1, P_2, \ldots, P_N) \subseteq \mathcal{L}(\mathsf{H})$. The projection on the linear
operator space $\mathcal{R}$ will be denoted by $\Pi_{\mathcal{R}}$.

It is clear that the knowledge of probabilities of an $\mathcal{R}$-informationally
complete POVM allows the calculation
of ensemble averages $\langle X\rangle_\rho$ for all $X \in \mathcal{R}$ by the simple
formula

$$\langle X\rangle_\rho = \sum_{i=1}^{N} c_i^X\, \mathrm{Tr}[\rho P_i], \tag{2}$$

$c^X$ denoting the data processing for X. Eq. (2) has to be
regarded as the definition itself of the processing function
$c^X$, in the sense that the coefficients $c_i^X$ must satisfy Eq.
(2) as a constraint. If the POVM elements are linearly
independent then the processing function $c^X : i \mapsto c_i^X$ for
an operator X is unique, whereas for linearly dependent
POVM elements there are infinitely many possible choices (notice
that even though Eq. (2) explicitly contains the processing
function, its value is independent of the specific
choice of $c^X$). These facts raise two questions: a)
how to find a suitable processing function $c^X$ for a given
operator X; b) which processing function $c^X$ minimizes
the statistical error

$$\delta_\rho^2(X) = \sum_{i=1}^{N} |c_i^X|^2\, \mathrm{Tr}[\rho P_i] - \langle X\rangle_\rho^2, \tag{3}$$

where, for simplicity, we restrict to selfadjoint X (notice
that the actual statistical error is obtained by dividing
$\delta_\rho(X)$ by $\sqrt{N_{\rm ex}}$, with $N_{\rm ex}$ the number of experiments).
In order to answer these questions we will
consider some elementary results in frame theory.

B. Elements of frame theory 

A frame in a Hilbert space K 0, [13 (for the sake of 
simplicity we will consider finite dimensional spaces) is
a set of vectors $\{v_i\}_{1\le i\le N} \subseteq \mathsf{K}$, with $N < \infty$, such that
there exist two constants $0 < a \le b < \infty$ with

$$a\,\|\psi\|^2_{\mathsf{K}} \le \sum_{i=1}^{N} |\langle v_i|\psi\rangle|^2 \le b\,\|\psi\|^2_{\mathsf{K}} \tag{4}$$

for all $\psi \in \mathsf{K}$. One can prove that for finite dimensional systems the
property of a set $\{v_i\}$ of being a frame is equivalent to
completeness, namely for all $\psi \in \mathsf{K}$ one can expand $\psi$
on the vectors $\{v_i\}$ by suitable coefficients. On the other
hand, a set of vectors $\{v_i\}$ in $\mathsf{K}$ is a frame
iff the frame operator

$$F = \sum_{i=1}^{N} |v_i\rangle\langle v_i| \tag{5}$$

is bounded and invertible. In this case, defining the
canonical dual frame $\{w_i\}$ by $|w_i\rangle = F^{-1}|v_i\rangle$ one has

$$F F^{-1} = \sum_{i=1}^{N} |v_i\rangle\langle w_i| = I, \tag{6}$$

and clearly the coefficients $\langle w_i|\psi\rangle$ are suitable for the
expansion of $\psi$ on the frame $\{v_i\}$, namely

$$|\psi\rangle = \sum_{i=1}^{N} |v_i\rangle\langle w_i|\psi\rangle. \tag{7}$$

The second interesting result [HI is the following classifi- 
cation of all possible alternate dual frames {zi} such that 

$\sum_{i=1}^{N} |v_i\rangle\langle z_i| = I$, which are given by

$$|z_i\rangle = |w_i\rangle + |y_i\rangle - \sum_{j=1}^{N} |y_j\rangle\langle v_j|w_i\rangle, \tag{8}$$

where $\{y_i\} \subseteq \mathsf{K}$ is arbitrary. If we now consider the
POVM P and take $\mathsf{K} \equiv \mathrm{Span}(\mathbf{P}) \subseteq \mathcal{L}(\mathsf{H})$, clearly the POVM
elements are a frame for $\mathrm{Span}(\mathbf{P})$, and a suitable processing
function $c^X$ for an operator X is provided by the
canonical dual frame. This answers the first question
about finding processing functions. In the next section
we will use the classification of alternate duals in Eq. (8)
to answer the second question about the minimization of
the statistical error.
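The frame-theoretic statements of this subsection can be checked numerically on a small real example. The sketch below is only an illustration under stated assumptions: it uses three unit vectors in the plane at 120 degrees (a choice not taken from the paper), builds the frame operator, the canonical dual $|w_i\rangle = F^{-1}|v_i\rangle$, and one alternate dual of the form $|z_i\rangle = |w_i\rangle + |y_i\rangle - \sum_j |y_j\rangle\langle v_j|w_i\rangle$ with arbitrarily chosen $y_i$. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def frame_operator(vs):
    """F = sum_i |v_i><v_i| for real 2-dimensional vectors."""
    return [[sum(v[a] * v[b] for v in vs) for b in range(2)] for a in range(2)]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def apply(m, v):
    return [sum(m[a][b] * v[b] for b in range(2)) for a in range(2)]

def canonical_dual(vs):
    """w_i = F^{-1} v_i."""
    f_inv = inverse_2x2(frame_operator(vs))
    return [apply(f_inv, v) for v in vs]

def alternate_dual(vs, ws, ys):
    """z_i = w_i + y_i - sum_j y_j <v_j|w_i>, with arbitrary vectors y_i."""
    return [[w[a] + y[a] - sum(yj[a] * dot(vj, w) for vj, yj in zip(vs, ys))
             for a in range(2)]
            for w, y in zip(ws, ys)]

def expand(vs, duals, psi):
    """Reconstruct psi = sum_i v_i <dual_i|psi>."""
    coeffs = [dot(d, psi) for d in duals]
    return [sum(c * v[a] for c, v in zip(coeffs, vs)) for a in range(2)]

# Assumed example frame: three unit vectors at 120 degrees (linearly dependent).
vs = [[math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)] for k in range(3)]
ws = canonical_dual(vs)
ys = [[1.0, 0.0], [0.0, 2.0], [-1.0, 3.0]]   # arbitrary choice
zs = alternate_dual(vs, ws, ys)
psi = [0.3, -1.2]
from_canonical = expand(vs, ws, psi)
from_alternate = expand(vs, zs, psi)
```

Both sets of coefficients reconstruct the same vector, which mirrors the non-uniqueness of the processing function for linearly dependent POVM elements.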



C. Optimization of the processing function 

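The quantity we want to minimize is the statistical
error in Eq. (3). Since the processing function is involved
only in the first term, the quantity to be minimized is the
following

$$\delta_\rho^2(X) + \langle X\rangle_\rho^2 = \sum_{i=1}^{N} |c_i^X|^2\, \mathrm{Tr}[\rho P_i]. \tag{9}$$

This quantity depends on the state $\rho$, but in a Bayesian
framework we can make it independent of $\rho$ by suitably
averaging Eq. (3) over a prior ensemble $\mathcal{E} =
\{p_j, \rho_j\}_{1\le j\le M}$, obtaining

$$\delta^2_{\mathcal{E}}(X) = \sum_{i=1}^{N} |c_i^X|^2\, \mathrm{Tr}[\rho_{\mathcal{E}} P_i] - \overline{\langle X\rangle^2_{\mathcal{E}}} = \sum_{j=1}^{M} p_j\, \delta^2_{\rho_j}(X), \tag{10}$$

where $\rho_{\mathcal{E}} := \sum_{j=1}^{M} p_j \rho_j$ and $\overline{\langle X\rangle^2_{\mathcal{E}}} := \sum_{j=1}^{M} p_j \langle X\rangle^2_{\rho_j}$.
The only term depending on the processing function is
$\sum_{i=1}^{N} |c_i^X|^2\, \mathrm{Tr}[\rho_{\mathcal{E}} P_i]$, which can be viewed as a norm for
the vector $c^X$ of coefficients in a Euclidean space $\mathsf{K}$, where
the metric matrix $\pi$ is diagonal on the canonical basis and
has eigenvalues $\pi_{ii} = \mathrm{Tr}[\rho_{\mathcal{E}} P_i]$. We can now define the
linear operator $\Lambda : \mathsf{K} \to \mathrm{Span}(\mathbf{P})$ such that

$$\Lambda c = \sum_{i=1}^{N} c_i P_i, \tag{11}$$

which has the following matrix elements $\Lambda_{mn,i} = (P_i)_{mn}$,
and all the generalized inverses $\Gamma : \mathrm{Span}(\mathbf{P}) \to \mathsf{K}$ of $\Lambda$
satisfying $\Lambda\Gamma\Lambda = \Lambda$ are in correspondence with alternate
duals D by the identity $\Gamma_{i,mn} = (D_i^*)_{mn}$. Generalizing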
the proof for the minimum norm pseudoinverse in Ref. 
[T^ | it was proved in Ref. [l4j that the minimum norm is 
achieved by $\Gamma$ satisfying

$$\pi\Gamma\Lambda = \Lambda^\dagger\Gamma^\dagger\pi, \tag{12}$$

and the corresponding optimal dual was derived in Ref. 
[HI], and can be expressed as follows 

$$D_i = \Delta_i - \sum_{j=1}^{N} \{[(I - M)\pi(I - M)]^{\ddagger}\, \pi M\}_{ij}\, \Delta_j, \tag{13}$$

where $\{\Delta_i\}$ is the canonical dual and the projection matrix
M has matrix elements $M_{ij} = \mathrm{Tr}[\Delta_i^\dagger P_j]$. The symbol
$Y^{\ddagger}$ denotes the Moore-Penrose generalized inverse
of Y, namely the symmetric, minimum norm and least
squares generalized inverse Z satisfying the conditions:
$ZYZ = Z$, $ZY = Y^\dagger Z^\dagger$, $YZ = Z^\dagger Y^\dagger$. Below
we will make use of the following compact formula for
the minimum noise, which was derived in Ref. [20]

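$$\delta^2_{\mathcal{E}}(X) = \langle\langle X|(\Lambda\pi^{-1}\Lambda^\dagger)^{\ddagger}|X\rangle\rangle - \overline{\langle X\rangle^2_{\mathcal{E}}}, \tag{14}$$

where $|X\rangle\rangle \in \mathsf{H}^{\otimes 2}$ is the vector corresponding to X as
follows

$$|X\rangle\rangle := \sum_{m,n=1}^{d} X_{mn}\, |m\rangle \otimes |n\rangle \leftrightarrow X, \tag{15}$$

for fixed basis $\{|m\rangle\}_{1\le m\le d}$ in $\mathsf{H}$. The following identities
are easily verified

$$\langle\langle X|Y\rangle\rangle = \mathrm{Tr}[X^\dagger Y], \quad A\otimes B|X\rangle\rangle = |AXB^T\rangle\rangle, \quad E|X\rangle\rangle = |X^T\rangle\rangle, \tag{16}$$

where $X^T$ is the transpose of X in the canonical basis,
and E is the swap operator $E|\phi\rangle \otimes |\psi\rangle = |\psi\rangle \otimes |\phi\rangle$.
Throughout the paper we will use the following notation
for orthogonal projectors over Hilbert-Schmidt subspaces
$\mathcal{S} \subseteq \mathcal{L}(\mathsf{H})$

$$\Pi_{\mathcal{S}} := \text{orthogonal projector over } \mathrm{Span}\{|X\rangle\rangle,\ X \in \mathcal{S}\}. \tag{17}$$

Since the POVM P is selfadjoint, namely $E|P_i^*\rangle\rangle = |P_i\rangle\rangle$
($P^* = (P^\dagger)^T$ denotes the complex conjugated operator),
its frame operator $F = \sum_{i=1}^{N} |P_i\rangle\rangle\langle\langle P_i|$ enjoys the following
property

$$E F^* E = F, \tag{18}$$

which is clearly shared by its inverse and by $\Pi_{\mathrm{Span}(\mathbf{P})} = F^{-1}F$.
The canonical dual $\{\Delta_i\}$ satisfies then the following identity

$$E|\Delta_i^*\rangle\rangle = E F^{-1*}|P_i^*\rangle\rangle = F^{-1}E|P_i^*\rangle\rangle = F^{-1}|P_i\rangle\rangle = |\Delta_i\rangle\rangle, \tag{19}$$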

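namely $\Delta_i^T = \Delta_i^*$, i.e. $\Delta_i^\dagger = \Delta_i$. Since all alternate duals D satisfy

$$\sum_{i=1}^{N} |D_i\rangle\rangle\langle\langle P_i| = \Pi_{\mathrm{Span}(\mathbf{P})} = E\,\Pi^*_{\mathrm{Span}(\mathbf{P})}\,E = \sum_{i=1}^{N} |D_i^\dagger\rangle\rangle\langle\langle P_i|, \tag{20}$$

it is clear that if D is an alternate dual then also $D^\dagger$
is. It is easy to verify that also $\frac{1}{2}(D_i + D_i^\dagger)$ is an alternate
dual. Suppose now that the optimal dual is not
selfadjoint; then there exists a selfadjoint X such that
$\Im(\mathrm{Tr}[D_i^\dagger X]) \neq 0$ for some i, and the minimum statistical error for
X would be

$$\begin{aligned}
\delta^2_{\mathcal{E}}(X) &= \sum_{i=1}^{N} |\mathrm{Tr}[D_i^\dagger X]|^2\, \mathrm{Tr}[\rho_{\mathcal{E}} P_i] - \overline{\langle X\rangle^2_{\mathcal{E}}} \\
&> \sum_{i=1}^{N} \Re(\mathrm{Tr}[D_i^\dagger X])^2\, \mathrm{Tr}[\rho_{\mathcal{E}} P_i] - \overline{\langle X\rangle^2_{\mathcal{E}}} \\
&= \sum_{i=1}^{N} \left(\mathrm{Tr}[(D_i^\dagger + D_i)X]/2\right)^2 \mathrm{Tr}[\rho_{\mathcal{E}} P_i] - \overline{\langle X\rangle^2_{\mathcal{E}}}.
\end{aligned} \tag{21}$$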



This is clearly absurd, since the last line is the statistical
error given by the dual $(D_i + D_i^\dagger)/2$. The canonical dual
and the optimal dual for any ensemble are then selfadjoint.

Writing the matrix elements of both sides in Eq. (12),
and considering that $(\Gamma\Lambda)_{ij} = \mathrm{Tr}[D_i^\dagger P_j]$, one has
$\pi_{ii}\, \mathrm{Tr}[D_i^\dagger P_j] = \mathrm{Tr}[P_i D_j]\, \pi_{jj}$. Summing both sides over
the index i we obtain $\mathrm{Tr}[\rho_{\mathcal{E}} P_j] = \mathrm{Tr}[D_j]\, \mathrm{Tr}[\rho_{\mathcal{E}} P_j]$, and
consequently $\mathrm{Tr}[D_j] = 1$ for all j such that $\mathrm{Tr}[P_j \rho_{\mathcal{E}}] \neq 0$.

D. Post-processing

We will call post-processing of a POVM a data-processing
which maps the POVM into another POVM,
namely

$$Q_j = \sum_{i=1}^{N} m(j|i)\, P_i, \tag{22}$$

where $m(j|i)$ is a conditional probability, namely the
corresponding matrix is Markov, i.e. $m(j|i) \ge 0$ and
$\sum_j m(j|i) = 1\ \forall i$. Clearly the post-processing is a special
case of data-processing array, corresponding to

$$c_i^{Q_j} = m(j|i). \tag{23}$$

Even though it can be regarded as a special case of data-processing,
the post-processing is conceptually very different,
being the randomization of set-theoretical operations.
Indeed, it corresponds to a randomization of the
following operations

T1 identification of two outcomes, e.g. j and k are
identified with the same outcome l, corresponding
to $m(n|j) = \delta_{ln}$ and $m(n|k) = \delta_{ln}$;

T2 permutation $\pi$ of outcomes, corresponding to $m(i|j) = \delta_{i\pi(j)}$;

T3 splitting of one outcome l into two outcomes j and
k, corresponding to choosing j with probability
$m(j|l) = p$ and k with probability $m(k|l) = 1 - p$,
$0 < p < 1$.

We can see that generally the cardinality of Q is different
from that of P. Also, notice that a data processing array
$c_i^{Q_j}$ for the POVM Q is not necessarily a Markov matrix,
since generally $c_i^{Q_j} \not\ge 0$, and also one does not necessarily have
the normalization $\sum_j c_i^{Q_j} = 1\ \forall i$, due to linear dependence
of the POVM P, even though there always exists an
alternate data processing that is normalized.

When two POVMs P and Q are connected by
post-processing we will write $\mathbf{P} \succ \mathbf{Q}$, and say that
the POVM P is cleaner under post-processing (post-processing
cleaner in short) than the POVM Q. The
relation $\succ$ is a pseudo-ordering, since it is i) reflexive,
i.e. $\mathbf{P} \succ \mathbf{P}$, corresponding to $m(i|j) = \delta_{ij}$; ii)
transitive, i.e. $\mathbf{P} \succ \mathbf{Q} \succ \mathbf{R}$ implies $\mathbf{P} \succ \mathbf{R}$, corresponding to
$R_i = \sum_k m(i|k)\, Q_k$, $Q_k = \sum_j m'(k|j)\, P_j \Rightarrow R_i = \sum_j m''(i|j)\, P_j$,
with $m''(i|j) = \sum_k m(i|k)\, m'(k|j)$.
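The action of a post-processing and the transitivity property can be illustrated with a short Python sketch. It assumes that each qubit POVM element is stored as a tuple of real coefficients on the operators $(I, \sigma_x, \sigma_y, \sigma_z)$, so that a post-processing is an ordinary linear combination of tuples. The specific POVM, the Markov matrices and the function names are hypothetical choices made for the example.

```python
def post_process(markov, povm):
    """Q_j = sum_i m(j|i) P_i, with markov[j][i] = m(j|i)."""
    n_coeffs = len(povm[0])
    return [tuple(sum(markov[j][i] * povm[i][a] for i in range(len(povm)))
                  for a in range(n_coeffs))
            for j in range(len(markov))]

def is_markov(markov, tol=1e-12):
    """Entries non-negative and sum_j m(j|i) = 1 for every input outcome i."""
    n_inputs = len(markov[0])
    nonneg = all(x >= 0 for row in markov for x in row)
    normalized = all(abs(sum(row[i] for row in markov) - 1) < tol
                     for i in range(n_inputs))
    return nonneg and normalized

def compose(m_outer, m_inner):
    """m''(i|j) = sum_k m_outer(i|k) m_inner(k|j)."""
    return [[sum(m_outer[i][k] * m_inner[k][j] for k in range(len(m_inner)))
             for j in range(len(m_inner[0]))]
            for i in range(len(m_outer))]

# Assumed example: random choice between the projectors of sigma_x and sigma_y.
povm = [(0.25, 0.25, 0.0, 0.0), (0.25, -0.25, 0.0, 0.0),
        (0.25, 0.0, 0.25, 0.0), (0.25, 0.0, -0.25, 0.0)]

# Identification of outcomes (operation T1): merge 0 with 2 and 1 with 3.
coarse = [[1, 0, 1, 0],
          [0, 1, 0, 1]]
q = post_process(coarse, povm)

# A further randomization of the two remaining outcomes.
smear = [[0.9, 0.2],
         [0.1, 0.8]]
r_two_steps = post_process(smear, q)
r_one_step = post_process(compose(smear, coarse), povm)
```

The two ways of computing R agree, and the composed matrix is again Markov, which is the content of the transitivity statement.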

We can define a partial ordering and an equivalence 
relation in terms of the POVM post-processing as follows. 



Definition 2 The POVMs P and Q are post-processing
equivalent (in symbols $\mathbf{P} \sim \mathbf{Q}$) iff both relations
$\mathbf{P} \succ \mathbf{Q}$ and $\mathbf{Q} \succ \mathbf{P}$ hold.

We are now in position to define cleanness under post-processing,
namely

Definition 3 A POVM P is post-processing clean if for
any POVM Q such that $\mathbf{Q} \succ \mathbf{P}$, then also $\mathbf{P} \succ \mathbf{Q}$ holds,
namely $\mathbf{P} \sim \mathbf{Q}$.

The characterization of cleanness under post-processing 
is very simple, and is given by the following theorem. 

Theorem 1 [21] A POVM P is post-processing clean iff
it is rank-one.

We refer the reader to Ref. [21] for the proof of the
Theorem.

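For a POVM Q with $\mathbf{Q} \nprec \mathbf{P}$, i.e. which is not a
post-processing of P, one can anyway introduce another
smeared-out version $\tilde{\mathbf{Q}}$ of Q

$$\tilde{Q}_j := \frac{Q_j + \epsilon_j I}{1 + \sum_l \epsilon_l}, \qquad \epsilon_j = \max_i\{0, -c_i^{Q_j}\}, \tag{24}$$

such that $\tilde{\mathbf{Q}} \prec \mathbf{P}$, i.e. $\tilde{\mathbf{Q}}$ is a post-processing of P. The
Markov matrix is simply given by

$$m(j|i) = \frac{c_i^{Q_j} + \epsilon_j}{1 + \sum_l \epsilon_l}. \tag{25}$$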



The perfect measurement of an observable corresponds
to a POVM made with the orthogonal projectors $X_j$ over
its eigenspaces, and we will write $\mathbf{X} = [X_j]$ with

$$X_j X_i = \delta_{ij} X_i \ge 0, \qquad \sum_j X_j = I. \tag{26}$$

More generally, we will say that

Definition 4 A POVM P describes an imperfect measurement
of the observable X if $\mathbf{X} \succ \mathbf{P}$, namely the
POVM P is a post-processing of X.

In practical terms this means that the measurement 
is a smearing-out of the perfect observable due to addi- 
tional noise which is ascribed to the output stage of the 
measuring apparatus. One can see that mathematically 
a POVM is a measurement of the observable X when it 
commutes with the observable. In this way the POVM P 
describing an imperfect measurement of X will be simply 
a function Pi = Pi(X) of the operator X. 



The concept of post-processing allows us to introduce a
general notion of joint measurement of (generally noncommuting)
observables.

Definition 5 (Joint measurement of observables)
We say that a POVM P achieves the joint measurement
of the observables $X^{(1)}, X^{(2)}, \ldots$, if for every observable
$X^{(\nu)}$ of the list there is a post-processing of P which
achieves an imperfect measurement of $X^{(\nu)}$.



We stress that in our operational point of view it is 
irrelevant that a joint measurement is described by a 
bivariate probability distribution (which could be inter- 
preted in terms of the alleged outcomes of the non com- 
muting observables A and B) . The only thing that mat- 
ters is the possibility of performing jointly imperfect mea- 
surements of both A and B, since, indeed, the joint prob- 
ability of their eigenvalues is counterfactual. 

The present definition of joint measurement for different
observables is sufficiently comprehensive to include
all known joint measurements, such as the joint
measurement of position and momentum [22], and the
measurement of the direction of an angular momentum,
corresponding to a POVM made with spin-coherent
states [23]. Indeed, the usual definition of joint measurement
simply involves the marginalization of multivariate
POVMs. A natural generalization of such a definition
of joint measurement to non-multivariate POVMs
would be simply to consider the marginalization as the
identification of outcomes in T1. Our definition of joint
measurements further generalizes the notion to any post-processing,
introducing also the natural transformations
T2 and T3.

We should notice that our definition (as the standard
ones) of joint measurements also includes some trivial
cases, in particular: a) pure guessing post-processing,
with Markov matrix with equal columns (data processing
independent of the outcome), corresponding to a
smeared-out POVM having each element proportional to
the identity (clearly for such trivial smearing-out each
POVM is the joint measurement of any set of observables);
b) the POVM P achieving the joint measurement
is actually the random selection of one observable at a
time, namely $\mathbf{P} = \bigcup_j \lambda_j \mathbf{X}^{(j)}$, where we define the convex
union $\mathbf{R} = \lambda\mathbf{P} \cup (1-\lambda)\mathbf{Q}$ of two POVMs P and Q with
cardinalities $|\mathbf{P}| = N$ and $|\mathbf{Q}| = M$ as follows

$$\mathbf{R} = \lambda\mathbf{P} \cup (1-\lambda)\mathbf{Q} := [\lambda P_1, \ldots, \lambda P_N, (1-\lambda)Q_1, \ldots, (1-\lambda)Q_M], \tag{27}$$

(more generally one can have even the random selection
of imperfect measurements of noncommuting observables).
In the following we will call the above joint
measurements trivial.
E. Measuring a POVM by another POVM

A special case of processing is the one corresponding
to another POVM $\mathbf{Q} = (Q_1, Q_2, \ldots, Q_M)$ in the span
$\mathrm{Span}(\mathbf{P})$. Notice that, even though one has the linearity
of processing functions $c^{X+Y} = c^X + c^Y$, for linearly dependent
POVM P the processing function is not unique,
whence generally $c_i^I \neq 1$, which implies that the processing
functions $c_i^{Q_j}$ for the POVM elements $Q_j$ generally do
not satisfy the normalization condition $\sum_j c_i^{Q_j} = 1$.
In addition, for $X \ge 0$ one does not necessarily have
$c_i^X \ge 0$. This implies that the $c_i^{Q_j}$ cannot be treated as conditional
probabilities $p(j|i) := c_i^{Q_j}$. Therefore, it is not generally
true that a POVM $\mathbf{Q} \subseteq \mathrm{Span}(\mathbf{P})$ can be achieved
as a post-processing of P. However, even though Q cannot
be obtained in this way, this is possible for a blurred
version of it according to the following theorem

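Theorem 2 Given a POVM $\mathbf{Q} \subseteq \mathrm{Span}(\mathbf{P})$, there exists
a POVM $\mathbf{Q}' \prec \mathbf{Q}$ that is a post-processing of P, or, in
other words, $\mathbf{Q}' \prec \mathbf{Q}$ and $\mathbf{Q}' \prec \mathbf{P}$.

Proof. As shown at the end of Sec. II C, the normalization
requirement is satisfied at least by the optimal processing,
since $c_i^I = \mathrm{Tr}[D_i] = 1$ for all i, for the optimal
dual D of P. For $c_i^{Q_j} \not\ge 0$, we can consider the "blurred"
POVM $\mathbf{Q}(\epsilon)$ with $Q_j(\epsilon) = (1-\epsilon)Q_j + \epsilon\frac{I}{M}$, which, for sufficiently
large $\epsilon > 0$, has $c_i^{Q_j(\epsilon)} = (1-\epsilon)c_i^{Q_j} + \epsilon/M \ge 0$. The minimum value
of $\epsilon$ is $\epsilon_* = \frac{-Mc}{1 - Mc}$, where $c = \min\{0, \min_{i,j}\{c_i^{Q_j}\}\}$. ■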
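The blurring used in the proof can be made concrete with a small numerical sketch. It assumes a normalized processing array ($\sum_j c_i^{Q_j} = 1$ for every i, as guaranteed by the optimal dual), so that the blurred coefficients are $(1-\epsilon)c_i^{Q_j} + \epsilon/M$ and the smallest admissible blurring is $\epsilon_* = -Mc/(1-Mc)$ with $c = \min\{0, \min_{i,j} c_i^{Q_j}\}$. The coefficient values and the function names below are hypothetical.

```python
def minimal_blurring(coeffs):
    """Smallest eps making (1 - eps) * c + eps / M non-negative for all entries.

    coeffs[i][j] plays the role of c_i^{Q_j}; each row is assumed to sum to 1.
    """
    m = len(coeffs[0])
    c = min(0.0, min(min(row) for row in coeffs))
    return -m * c / (1 - m * c)

def blurred_coefficients(coeffs, eps):
    """Processing array of the blurred POVM Q(eps)."""
    m = len(coeffs[0])
    return [[(1 - eps) * x + eps / m for x in row] for row in coeffs]

def unblur(p_blurred, eps, m):
    """Recover Tr[rho Q_j] from the probability estimated for Q_j(eps)."""
    return (p_blurred - eps / m) / (1 - eps)

# Assumed example: three outcomes of P, two outcomes of Q, rows sum to one.
coeffs = [[1.5, -0.5],
          [-0.5, 1.5],
          [0.5, 0.5]]
eps_star = minimal_blurring(coeffs)
markov = blurred_coefficients(coeffs, eps_star)

# Round trip on a single probability value.
p_true = 0.3
p_blurred = (1 - eps_star) * p_true + eps_star / 2
p_recovered = unblur(p_blurred, eps_star, 2)
```

With these numbers the blurred array is non-negative with rows summing to one, so it can be used as a conditional probability $p(j|i)$; the price is that the recovered probability carries an error amplified by $1/(1-\epsilon_*)$.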

How can we interpret the indirect measurement of Q?
In our approach to the theory of statistics of quantum
measurements the POVM represents a question asked
by the experimenter, and the answer is the outcome. A
POVM Q in the space $\mathrm{Span}(\mathbf{P})$ associated to the POVM
P is a question that can be indirectly asked through the
POVM P, corresponding to the following rule: for given
outcome i of the POVM P pick the answer j out of the
set $1, \ldots, M$ randomly according to the conditional probability
$p(j|i) = c_i^{Q_j(\epsilon_*)}$.

If we collect the statistics for the answers j obtained
through this strategy, we asymptotically obtain the probabilities

$$\mathrm{Tr}[\rho\, Q_j(\epsilon_*)] = (1-\epsilon_*)\,\mathrm{Tr}[\rho\, Q_j] + \frac{\epsilon_*}{M}. \tag{28}$$

The estimated probabilities are not exactly $\mathrm{Tr}[\rho Q_j]$, but
since $\epsilon_*$ is exactly known, one can retrieve $\mathrm{Tr}[\rho Q_j]$ by the
formula

$$\mathrm{Tr}[\rho\, Q_j] = \frac{1}{1-\epsilon_*}\left(\mathrm{Tr}[\rho\, Q_j(\epsilon_*)] - \frac{\epsilon_*}{M}\right). \tag{29}$$



The statistical error on such estimate of Tr[pQj] is now 



given by 



Tr \pQj] Tr[/9fPj], and since 



M 



(30) 



the statistical error in the estimate of the probabilities
$\mathrm{Tr}[\rho Q_j]$ is just $\frac{1}{1-\epsilon_*}$ times greater than the statistical
error in the estimate of $\mathrm{Tr}[\rho Q_j(\epsilon_*)]$, and the estimated
probability $\mathrm{Tr}[\rho Q_j]$ is unbiased.

Moreover, if the POVM Q is the spectral decomposition
of an operator X, then one can obtain $\langle X\rangle$ by taking
$\sum_{j=1}^{M} x_j \langle Q_j\rangle$. The minimum error in the estimate is the
same that one would obtain by estimating $\langle X(\epsilon_*)\rangle =
(1-\epsilon_*)\langle X\rangle + \mathrm{Tr}[X]\,\epsilon_*/M$, where $X(\epsilon_*) := \sum_{j=1}^{M} x_j Q_j(\epsilon_*)$,
and then calculating $\langle X\rangle$ by taking

$$\langle X\rangle = \frac{1}{1-\epsilon_*}\left(\langle X(\epsilon_*)\rangle - \frac{\epsilon_*}{M}\,\mathrm{Tr}[X]\right). \tag{31}$$

Notice that the coefficients $c_i^{Q_j}$ can then be interpreted
as matrix elements of a linear transformation that brings
eigenvalues $x_j$ of X to the processing function for X,
$c_i^X = \sum_{j=1}^{M} c_i^{Q_j} x_j$. If the $c_i^{Q_j}$ are evaluated through the
optimal dual, we can say that $c_i^X$ is the best estimate of X
provided that the outcome i has occurred in a measurement
of the POVM P, since the estimate of $\langle X\rangle$ arising
from this strategy has the minimum statistical error.

From Theorem 2 and Definitions 1 and 5, it follows
immediately that

Theorem 3 Every $\mathcal{R}$-informationally complete measurement
is an unbiased joint measurement of all observables
in $\mathrm{Span}(\mathcal{R} \cup \mathcal{R}^\dagger)$,

where we denoted by $\mathcal{R}^\dagger$ the linear space spanned by the
adjoints of all operators in $\mathcal{R}$. Moreover, one has

Corollary 1 Every informationally complete measurement
is a nontrivial joint measurement of all observables.



III. AB-INFORMATIONALLY COMPLETE
MEASUREMENTS

The problem of estimating the full probability distribution
of two noncommuting observables A and B can be
treated by considering the space spanned by independent
powers of A and B, which we call AB-space

$$\mathcal{S}_{AB} = \mathrm{Span}\{A^n, B^n,\ n = 0, 1, 2, \ldots\}. \tag{32}$$

The corresponding projection (in the sense of Eq. (17))
will be denoted by $\Pi_{AB}$. The POVMs allowing for simultaneous
measurement of A, B, and their independent
powers are what we call AB-informationally complete
measurements, whose space $\mathrm{Span}(\mathbf{P})$ contains $\mathcal{S}_{AB}$.
Usually in the literature, a self-adjoint operator $X =
\sum_i x_i X_i$ is associated to the observable $\mathbf{X}$, and the probability
distribution $p(i|\rho) = \mathrm{Tr}[X_i \rho]$ is recovered from the
moments of X through the set of eigenvalues $x_i \in \mathbb{R}$.
The relation between probabilities and moments passes
through the identity

$$X_h = \sum_{j=0}^{s-1} W_{jh} X^j = \sum_{j=0}^{s-1}\sum_{k=1}^{s} W_{jh}\, x_k^j\, X_k, \tag{33}$$

whence $\sum_{j=0}^{s-1} W_{jh}\, x_k^j = \delta_{hk}$, namely W is the inverse
of the Vandermonde matrix $W^{-1} = \{x_k^j\}$. Linear independence
of the powers of X up to the (s-1)-th (and linear
dependence of any higher power), where s is the cardinality
of the spectrum of X, follows from the fact that
the minimal polynomial of X

$$m_X(x) = \prod_{h=1}^{s} (x - x_h)$$

vanishes as $m_X(X) = 0$, and it is the minimal degree
polynomial vanishing at X, whence all powers $X^n$, $0 \le
n \le s-1$, and only such powers, are linearly independent.
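The relation between moments and probabilities can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the coefficients $W_{jh}$ are obtained as the coefficients of the Lagrange polynomial $\prod_{k\neq h}(x - x_k)/(x_h - x_k)$, which is one standard way of inverting a Vandermonde matrix, and the spectrum and distribution used are hypothetical.

```python
from fractions import Fraction

def lagrange_coefficients(xs, h):
    """Coefficients W_{jh}, j = 0..s-1, of prod_{k != h} (x - x_k)/(x_h - x_k)."""
    coeffs = [Fraction(1)]
    for k, xk in enumerate(xs):
        if k == h:
            continue
        scale = Fraction(1) / (Fraction(xs[h]) - Fraction(xk))
        # Multiply the current polynomial by (x - x_k) * scale.
        new = [Fraction(0)] * (len(coeffs) + 1)
        for j, cj in enumerate(coeffs):
            new[j] += -Fraction(xk) * cj * scale
            new[j + 1] += cj * scale
        coeffs = new
    return coeffs

def moments_from_probabilities(xs, ps):
    """<X^j> = sum_k x_k^j p_k for j = 0..s-1."""
    return [sum(Fraction(x) ** j * Fraction(p) for x, p in zip(xs, ps))
            for j in range(len(xs))]

def probabilities_from_moments(xs, moments):
    """p_h = sum_j W_{jh} <X^j>."""
    return [sum(w * m for w, m in zip(lagrange_coefficients(xs, h), moments))
            for h in range(len(xs))]

# Assumed example: an observable with three distinct eigenvalues.
eigenvalues = [-1, 0, 1]
true_probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
moments = moments_from_probabilities(eigenvalues, true_probs)
recovered = probabilities_from_moments(eigenvalues, moments)
```

Only the first s moments (including the zeroth) are needed, in agreement with the linear dependence of all higher powers of X.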

Using Theorem 2 we see that there always exist two
data processings of an AB-infocomplete measurement giving
two unbiased Abelian POVMs commuting with A and
B, respectively. Therefore, one has

Corollary 2 Every AB-informationally complete measurement
is an unbiased joint measurement of observables
$A^n$ and $B^n$, for all integer n.

A special case is that of minimal AB-informationally
complete POVMs, whose space $\mathrm{Span}(\mathbf{P})$ exactly coincides
with $\mathcal{S}_{AB}$. Notice that an example of AB-informationally
complete POVM is readily given by the
union of the two orthonormal resolutions of A and B with
a rescaling by a factor $\frac{1}{2}$. From this example we can conclude
that the projection $\Pi_{AB}$ also enjoys the property
$E\,\Pi^*_{AB}\,E = \Pi_{AB}$. We can translate the two properties
of simultaneous measurements and AB-informationally
complete measurements as follows:

1. P is AB-informationally complete iff

$$\Pi_{AB}\,\Pi_{\mathrm{Span}(\mathbf{P})} = \Pi_{\mathrm{Span}(\mathbf{P})}\,\Pi_{AB} = \Pi_{AB}. \tag{34}$$

2. P is minimal AB-informationally complete iff

$$\Pi_{\mathrm{Span}(\mathbf{P})} = \Pi_{AB}. \tag{35}$$



Notice that a joint measurement of A and B is generally
non minimal, e.g. it provides also estimation of
correlations, which is the case of the joint measurement
of position and momentum which minimizes the product
of uncertainties [22], or of the covariant measurement
of the angular momentum [23]. We conjecture that the
minimum-error POVMs belong to the set of minimal
AB-informationally complete POVMs. In the next section
we will show that for $\rho_{\mathcal{E}} \in \mathcal{S}_{AB}$ the conjecture is true
for dimension d = 2. Moreover, we have the following

Lemma 1 Sufficient conditions for minimality of the optimal
AB-informationally complete POVM Q:

1. the state $\rho_{\mathcal{E}}$ belongs to $\mathcal{S}_{AB}$;

2. there exists an optimal POVM P which is AB-informationally
complete, and such that the operators
$Q_i$ given by $|Q_i\rangle\rangle = \Pi_{AB}|P_i\rangle\rangle$ are all positive.



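Proof. Let us consider the minimum error in Eq. (14)
and recall that we are interested in operators X such that
$\Pi_{AB}|X\rangle\rangle = |X\rangle\rangle$. Then

$$\delta^2_{\mathcal{E}}(X) = \langle\langle X|\Pi_{AB}(\Lambda\pi^{-1}\Lambda^\dagger)^{\ddagger}\Pi_{AB}|X\rangle\rangle - \overline{\langle X\rangle^2_{\mathcal{E}}}. \tag{36}$$

Since $\Pi_{AB}(\Lambda\pi^{-1}\Lambda^\dagger)^{\ddagger}\Pi_{AB} \ge (\Pi_{AB}\Lambda\pi^{-1}\Lambda^\dagger\Pi_{AB})^{-1}$,
and since $\Lambda\pi^{-1}\Lambda^\dagger = \sum_{i=1}^{N} \frac{1}{\mathrm{Tr}[P_i\rho_{\mathcal{E}}]}\,|P_i\rangle\rangle\langle\langle P_i|$, then we have
to minimize

$$\langle\langle X|(\Pi_{AB}\Lambda\pi^{-1}\Lambda^\dagger\Pi_{AB})^{-1}|X\rangle\rangle = \langle\langle X|\left(\sum_{i} \frac{|Q_i\rangle\rangle\langle\langle Q_i|}{\mathrm{Tr}[P_i\rho_{\mathcal{E}}]}\right)^{-1}|X\rangle\rangle, \tag{37}$$

where $|Q_i\rangle\rangle = \Pi_{AB}|P_i\rangle\rangle$. Notice that $\{Q_i\}$ is normalized,
since

$$\sum_{i=1}^{N} |Q_i\rangle\rangle = \Pi_{AB}\sum_{i=1}^{N} |P_i\rangle\rangle = \Pi_{AB}|I\rangle\rangle = |I\rangle\rangle, \tag{38}$$

but in general $\{Q_i\}$ might not be a POVM, because positivity
is not preserved by the projection $\Pi_{AB}$. However, we
require $Q_i \ge 0$ as a condition, whence Q is a POVM, and
the optimal processing is then obtained via the optimal
dual of $\{Q_i\}$.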



A. The case of qubits 

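The quantum states of a qubit are conveniently represented
on the Bloch sphere as follows

$$\rho = \frac{1}{2}(I + \mathbf{n}\cdot\boldsymbol{\sigma}), \tag{39}$$

where $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are the three Pauli operators, and
$\mathbf{n}$ is a vector with norm $\|\mathbf{n}\| \le 1$. Since any positive
operator is proportional to a state, any POVM can be
represented as follows,

$$P_i = \alpha_i I + \beta_i\sigma_x + \gamma_i\sigma_y + \delta_i\sigma_z, \tag{40}$$

where $\alpha_i \ge 0$ and $\{\beta_i\}$, $\{\gamma_i\}$, $\{\delta_i\}$ are real coefficients
such that

$$\alpha_i^2 - \beta_i^2 - \gamma_i^2 - \delta_i^2 \ge 0, \tag{41}$$

and the normalization is given by

$$\sum_{i=1}^{N}\alpha_i = 1, \qquad \sum_{i=1}^{N}\beta_i = \sum_{i=1}^{N}\gamma_i = \sum_{i=1}^{N}\delta_i = 0. \tag{42}$$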
Notice that, apart from a multiplication factor and a unitary
transformation, any couple of noncommuting traceless operators A and B
is equivalent to the following one

$$\sigma_\pm(\theta) = \sigma_x\cos\theta \pm \sigma_y\sin\theta, \qquad (43)$$

whose commutator is $[\sigma_+(\theta), \sigma_-(\theta)] = -2i\sigma_z\sin 2\theta$.
Therefore, without loss of generality, we will restrict attention to
$\sigma_\pm(\theta)$ [26].
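The commutator of the pair $\sigma_\pm(\theta) = \sigma_x\cos\theta \pm \sigma_y\sin\theta$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`matmul`, `lincomb`, `sigma_pm`, `commutator`, `max_abs_diff`) and the sample angle are illustrative assumptions, not part of the paper.

```python
import math

# Pauli matrices as 2x2 nested lists of (complex) numbers.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def sigma_pm(theta, sign):
    """sigma_x*cos(theta) + sign*sigma_y*sin(theta), with sign = +1 or -1."""
    return lincomb(math.cos(theta), SX, sign * math.sin(theta), SY)

def commutator(a, b):
    """[a, b] = ab - ba."""
    return lincomb(1, matmul(a, b), -1, matmul(b, a))

def max_abs_diff(a, b):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

theta = 0.3  # arbitrary sample angle (assumption of this sketch)
lhs = commutator(sigma_pm(theta, +1), sigma_pm(theta, -1))
rhs = lincomb(-2j * math.sin(2 * theta), SZ, 0, SZ)  # -2i sin(2 theta) sigma_z
print(max_abs_diff(lhs, rhs))  # deviation at rounding-error level
```

The commutator vanishes only for $\sin 2\theta = 0$, which is the case in which the two operators coincide up to a sign.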
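Now $\mathcal S_{AB} = \mathrm{Span}\{\sigma_x, \sigma_y, I\}$, and we
consider the case of $\rho_{\mathcal E} \in \mathcal S_{AB}$. Let us
take a general POVM P such that
$\Pi_{\mathrm{Span}(P)}\Pi_{\sigma_x,\sigma_y} = \Pi_{\sigma_x,\sigma_y}$.
By definition, such a POVM is $\sigma_x,\sigma_y$-informationally
complete. We can now prove that the operators $\{Q_i\}$ defined by
$|Q_i) = \Pi_{\sigma_x,\sigma_y}|P_i)$ make a POVM. The normalization
can be proved as in Eq. (38). On the other hand, $Q_i$ is positive, and
this can be proved considering Eq. (40). In fact, acting with
$\Pi_{\sigma_x,\sigma_y}$ on $P_i$ one has

$$\Pi_{\sigma_x,\sigma_y}|P_i) = |Q_i) = \alpha_i|I) + \beta_i|\sigma_x) + \gamma_i|\sigma_y). \qquad (44)$$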



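Clearly, the conditions for positivity in Eq. (41) are still
satisfied, since dropping the term $\delta_i\sigma_z$ shortens the
Bloch vector of $P_i$ while leaving $\alpha_i$ unchanged. We have then
proved that $\{Q_i\}$ is a minimal $\sigma_x,\sigma_y$-informationally
complete POVM. Moreover, since $\rho_{\mathcal E} \in \mathcal S_{AB}$,
then

$$\mathrm{Tr}[P_i\rho_{\mathcal E}] = (P_i|\Pi_{\sigma_x,\sigma_y}|\rho_{\mathcal E}) = \mathrm{Tr}[Q_i\rho_{\mathcal E}], \qquad (45)$$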



namely P and Q give the same probability distribution over the state
$\rho_{\mathcal E}$, whence they will have the same expectations when
averaging over the ensemble $\mathcal E$. Therefore, we are in the
conditions of Lemma 1, whence for optimal P the constructed Q is
optimal and minimal.
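The projection argument above can be illustrated on a concrete qubit POVM. The sketch below stores each element $\alpha I + \beta\sigma_x + \gamma\sigma_y + \delta\sigma_z$ as a tuple `(alpha, beta, gamma, delta)`; the choice of the tetrahedral POVM as input, the two test Bloch vectors and all function names are assumptions made only for illustration.

```python
import math

def is_positive(elem, tol=1e-12):
    """alpha*I + beta*sx + gamma*sy + delta*sz >= 0 iff alpha >= |(beta, gamma, delta)|."""
    a, b, g, d = elem
    return a + tol >= math.sqrt(b * b + g * g + d * d)

def project_xy(elem):
    """Projection onto Span{I, sigma_x, sigma_y}: drop the sigma_z component."""
    a, b, g, _ = elem
    return (a, b, g, 0.0)

def prob(elem, n):
    """Tr[P rho] for rho = (I + n.sigma)/2, equal to alpha + (beta, gamma, delta).n."""
    a, b, g, d = elem
    return a + b * n[0] + g * n[1] + d * n[2]

# Illustrative input (assumption): tetrahedral POVM, P_i = (I + n_i.sigma)/4,
# with n_i the vertices of a regular tetrahedron.
s = 1 / math.sqrt(3)
P = [(0.25, 0.25 * x * s, 0.25 * y * s, 0.25 * z * s)
     for x, y, z in [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]]
Q = [project_xy(e) for e in P]

n_in_plane = (0.6, -0.3, 0.0)  # state inside Span{I, sigma_x, sigma_y}
n_generic = (0.2, 0.1, 0.7)    # state with a sigma_z component

# Q is still made of positive operators.
print(all(is_positive(e) for e in Q))
# P and Q agree on states in the span, and differ on generic states.
print(max(abs(prob(p, n_in_plane) - prob(q, n_in_plane)) for p, q in zip(P, Q)))
print(max(abs(prob(p, n_generic) - prob(q, n_generic)) for p, q in zip(P, Q)))
```

For a qubit the projection always preserves positivity; the check is included because this is not guaranteed for a generic pair A, B in higher dimension.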

From now on, we will consider POVMs P such that
$\Pi_{\sigma_x,\sigma_y}|P_i) = |P_i)$. Moreover, we will restrict our
attention to ensembles with an isotropic distribution, having
$\rho_{\mathcal E} = \frac{I}{2}$. In this case
$\pi_{ii} = \mathrm{Tr}[P_i]/2 = \alpha_i$. It is clear that we can
consider rank-one POVMs, since if $P_i$ is rank 2 for some i, then its
spectral projections can be written as

$$P_i^{(1)} = \frac{\lambda_2 I}{\lambda_2 - \lambda_1} - \frac{P_i}{\lambda_2 - \lambda_1}, \qquad P_i^{(2)} = \frac{\lambda_1 I}{\lambda_1 - \lambda_2} - \frac{P_i}{\lambda_1 - \lambda_2}, \qquad (46)$$

where $\lambda_j$ are the two eigenvalues of $P_i$. The spectral
projections then belong to the space $\mathcal S_{\sigma_x,\sigma_y}$,
being linear combinations of I and $P_i$. Consequently, any
$\sigma_x,\sigma_y$-informationally complete POVM can be simulated by a
rank-one $\sigma_x,\sigma_y$-informationally complete one, whence there
exists an optimal minimal rank-one POVM which is
$\sigma_x,\sigma_y$-informationally complete.

Rank-one minimal $\sigma_x,\sigma_y$-informationally complete POVMs can
be easily characterized by restricting the conditions in Eq. (41) as
follows

$$\beta_i^2 + \gamma_i^2 = \alpha_i^2, \qquad \alpha_i > 0. \qquad (47)$$



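The matrix $\Lambda\pi^{-1}\Lambda^\dagger$ can be written as

$$\Lambda\pi^{-1}\Lambda^\dagger = \sum_{i=1}^N \frac{1}{\alpha_i}\,|P_i)(P_i|, \qquad (48)$$

which is represented on the orthonormal basis
$\{\frac{1}{\sqrt2}|I), \frac{1}{\sqrt2}|\sigma_x), \frac{1}{\sqrt2}|\sigma_y)\}$
in the block-diagonal form

$$\Lambda\pi^{-1}\Lambda^\dagger = \begin{pmatrix} 2 & 0 \\ 0 & K \end{pmatrix}, \qquad (49)$$

with K being the $2\times2$ matrix

$$K = 2\begin{pmatrix} \sum_{i=1}^N \frac{\beta_i^2}{\alpha_i} & \sum_{i=1}^N \frac{\beta_i\gamma_i}{\alpha_i} \\ \sum_{i=1}^N \frac{\beta_i\gamma_i}{\alpha_i} & \sum_{i=1}^N \frac{\gamma_i^2}{\alpha_i} \end{pmatrix}. \qquad (50)$$

The inverse can be easily calculated, and is equal to

$$(\Lambda\pi^{-1}\Lambda^\dagger)^{-1} = \begin{pmatrix} \frac12 & 0 & 0 \\ 0 & \frac{2}{D}\sum_{i=1}^N \frac{\gamma_i^2}{\alpha_i} & -\frac{2}{D}\sum_{i=1}^N \frac{\beta_i\gamma_i}{\alpha_i} \\ 0 & -\frac{2}{D}\sum_{i=1}^N \frac{\beta_i\gamma_i}{\alpha_i} & \frac{2}{D}\sum_{i=1}^N \frac{\beta_i^2}{\alpha_i} \end{pmatrix}, \qquad (51)$$

where $D = \det(K)$.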

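Using this expression we can evaluate the error for $\sigma_\pm(\theta)$

$$\delta^2_{\mathcal E}(\sigma_\pm(\theta)) = \cos^2\theta\left(\frac{2\Gamma}{D} - \overline{\langle\sigma_x\rangle^2}_{\mathcal E}\right) + \sin^2\theta\left(\frac{2B}{D} - \overline{\langle\sigma_y\rangle^2}_{\mathcal E}\right) \mp 2\sin\theta\cos\theta\left[\frac{2\Delta}{D} + \overline{\langle\sigma_x\rangle\langle\sigma_y\rangle}_{\mathcal E}\right], \qquad (52)$$

where we defined $\Gamma := 2\sum_{i=1}^N \frac{\gamma_i^2}{\alpha_i}$,
$B := 2\sum_{i=1}^N \frac{\beta_i^2}{\alpha_i}$ and
$\Delta := 2\sum_{i=1}^N \frac{\beta_i\gamma_i}{\alpha_i}$, and
consequently $D = B\Gamma - \Delta^2$. The total error
$\delta^2_{\mathcal E}(\theta) := \delta^2_{\mathcal E}(\sigma_+(\theta)) + \delta^2_{\mathcal E}(\sigma_-(\theta))$
is given by

$$\delta^2_{\mathcal E}(\theta) = 2\left[\cos^2\theta\left(\frac{2\Gamma}{D} - \overline{\langle\sigma_x\rangle^2}_{\mathcal E}\right) + \sin^2\theta\left(\frac{2B}{D} - \overline{\langle\sigma_y\rangle^2}_{\mathcal E}\right)\right], \qquad (53)$$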



and we can prove by the following argument that the optimal POVM is
such that $\Delta = 0$. Indeed, consider a POVM P with given
coefficients $\alpha_i, \beta_i, \gamma_i$ corresponding to given
values for $B, \Gamma, \Delta$. Now consider the POVM P' with the same
coefficients $\alpha'_i = \alpha_i$ and $\beta'_i = \beta_i$ as P and
with $\gamma'_i = -\gamma_i$, corresponding to $B' = B$,
$\Gamma' = \Gamma$ and $\Delta' = -\Delta$. If we take now the POVM
$P'' := \frac12(P_1, \dots, P_N, P'_1, \dots, P'_N)$, then the
corresponding values can be readily calculated to be $B'' = B = B'$,
$\Gamma'' = \Gamma = \Gamma'$ and $\Delta'' = 0$. Correspondingly, the
expression for the determinant $D''$ becomes
$D'' = B\Gamma \geq D = B\Gamma - \Delta^2$. Since the POVM P'' can be
constructed from any POVM P, then clearly the optimal POVM minimizing
the total noise $\delta^2_{\mathcal E}(\theta)$ is such that
$\Delta = 0$.
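The symmetrization argument can be traced numerically. The sketch below stores each element $\alpha I + \beta\sigma_x + \gamma\sigma_y$ as a tuple `(alpha, beta, gamma)` and takes the measurement-dependent part of the total error to be $4(\cos^2\theta\,\Gamma/D + \sin^2\theta\,B/D)$, as follows from the inverse matrix above. The input POVM (two antipodal pairs along non-orthogonal directions), the angle `theta` and all function names are illustrative assumptions.

```python
import math

def moments(povm):
    """Return (B, Gamma, Delta, D) for elements (alpha, beta, gamma)."""
    B = 2 * sum(b * b / a for a, b, g in povm)
    G = 2 * sum(g * g / a for a, b, g in povm)
    Dl = 2 * sum(b * g / a for a, b, g in povm)
    return B, G, Dl, B * G - Dl * Dl

def symmetrize(povm):
    """P'' = (1/2)(P_1..P_N, P'_1..P'_N), where P' flips the sign of gamma."""
    mirrored = [(a, b, -g) for a, b, g in povm]
    return [(a / 2, b / 2, g / 2) for a, b, g in povm + mirrored]

def measurement_term(theta, povm):
    """Measurement-dependent part of the total error, 4*(cos^2*Gamma + sin^2*B)/D."""
    B, G, _, D = moments(povm)
    return 4 * (math.cos(theta) ** 2 * G + math.sin(theta) ** 2 * B) / D

def pair(weight, phi):
    """Two rank-one elements with total weight `weight`, along directions +/- phi."""
    a = weight / 2
    return [(a, a * math.cos(phi), a * math.sin(phi)),
            (a, -a * math.cos(phi), -a * math.sin(phi))]

povm = pair(0.5, 0.4) + pair(0.5, 1.1)  # rank-one POVM with Delta != 0
sym = symmetrize(povm)
theta = 0.6

print(moments(povm))  # Delta is nonzero here
print(moments(sym))   # same B and Gamma, Delta = 0, larger determinant
print(measurement_term(theta, povm) >= measurement_term(theta, sym))
```

Since B and $\Gamma$ are unchanged while D can only grow, the symmetrized POVM never has a larger total error than the original one.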

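Then, we have that $(\Lambda\pi^{-1}\Lambda^\dagger)^{-1}$ becomes
diagonal

$$(\Lambda\pi^{-1}\Lambda^\dagger)^{-1} = \mathrm{diag}\left(\frac12, \frac1B, \frac1\Gamma\right). \qquad (54)$$

For rank-one POVMs, notice that
$\sum_{i=1}^N (\beta_i^2 + \gamma_i^2)/\alpha_i = \sum_{i=1}^N \alpha_i = 1$,
namely, $B + \Gamma = 2$, and the total error is given by

$$\delta^2_{\mathcal E}(\theta) = 4\left(\frac{\cos^2\theta}{B} + \frac{\sin^2\theta}{2 - B}\right) - 2k, \qquad (55)$$

with $k = \frac12\left(\overline{\langle\sigma_+(\theta)\rangle^2}_{\mathcal E} + \overline{\langle\sigma_-(\theta)\rangle^2}_{\mathcal E}\right)$.
The minimum of Eq. (55) as a function of B can be easily obtained,
leading to the following bound for the total error

$$\delta^2_{\mathcal E}(\sigma_+(\theta)) + \delta^2_{\mathcal E}(\sigma_-(\theta)) \geq 2(1 + \sin 2\theta - k). \qquad (56)$$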
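The minimization over B can be made explicit. The following worked step is a sketch that takes the total error in the form $4f(B) - 2k$ and assumes $0 < \theta < \pi/2$ (so that $\cos\theta$ and $\sin\theta$ are positive) and $0 < B < 2$; the symbols $f$ and $B^\ast$ are introduced here only for illustration.

```latex
\begin{align*}
f(B) &:= \frac{\cos^2\theta}{B} + \frac{\sin^2\theta}{2-B}, \\
f'(B) &= -\frac{\cos^2\theta}{B^2} + \frac{\sin^2\theta}{(2-B)^2} = 0
  \;\Longrightarrow\; \frac{2-B}{B} = \frac{\sin\theta}{\cos\theta}
  \;\Longrightarrow\; B^\ast = \frac{2\cos\theta}{\cos\theta+\sin\theta}, \\
f(B^\ast) &= \frac{\cos\theta(\cos\theta+\sin\theta)}{2}
  + \frac{\sin\theta(\cos\theta+\sin\theta)}{2}
  = \frac{(\cos\theta+\sin\theta)^2}{2} = \frac{1+\sin 2\theta}{2}, \\
4f(B^\ast) - 2k &= 2(1+\sin 2\theta - k).
\end{align*}
```

Since $f$ is convex on $(0, 2)$, the stationary point $B^\ast$ is the global minimum on that interval.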

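We will now provide two POVMs that achieve the bound. The first one has
the following three elements

$$P_1 = p(I + \sigma_x), \qquad P_{2\pm} = \frac{1-p}{2}I - \frac{p}{2}\sigma_x \pm \frac{\sqrt{1-2p}}{2}\sigma_y, \qquad (57)$$

with $p = \frac{\cos\theta}{2\cos\theta + \sin\theta}$, and the second
one has four elements

$$P_{1\pm} = \frac{p}{2}(I \pm \sigma_x), \qquad P_{2\pm} = \frac{1-p}{2}(I \pm \sigma_y), \qquad (58)$$

with $p = \frac{\cos\theta}{\cos\theta + \sin\theta}$. For equal
uncertainties, the minimum product of the r.m.s. errors is given by

$$\delta_{\mathcal E}(\sigma_+(\theta))\,\delta_{\mathcal E}(\sigma_-(\theta)) = \delta^2_{\mathcal E}(\sigma_\pm(\theta)) = 1 + \sin 2\theta - k. \qquad (59)$$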
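That the three-outcome and four-outcome POVMs saturate the bound can be verified numerically. The sketch below stores each element $\alpha I + \beta\sigma_x + \gamma\sigma_y$ as a tuple `(alpha, beta, gamma)` and compares the measurement-dependent part of the total error, $4(\cos^2\theta/B + \sin^2\theta/\Gamma)$ for $\Delta = 0$, with $2(1 + \sin 2\theta)$. The values of p used here are the ones that follow from $B = 2\cos\theta/(\cos\theta + \sin\theta)$; the function names and the sample angle are assumptions of this sketch.

```python
import math

def moments(povm):
    """Return (B, Gamma, Delta) for elements (alpha, beta, gamma)."""
    B = 2 * sum(b * b / a for a, b, g in povm)
    G = 2 * sum(g * g / a for a, b, g in povm)
    Dl = 2 * sum(b * g / a for a, b, g in povm)
    return B, G, Dl

def three_outcome(theta):
    """P_1 = p(I + sx), P_2pm = (1-p)/2 I - p/2 sx +/- sqrt(1-2p)/2 sy."""
    c, s = math.cos(theta), math.sin(theta)
    p = c / (2 * c + s)
    r = math.sqrt(1 - 2 * p) / 2
    return [(p, p, 0.0), ((1 - p) / 2, -p / 2, r), ((1 - p) / 2, -p / 2, -r)]

def four_outcome(theta):
    """P_1pm = p/2 (I +/- sx), P_2pm = (1-p)/2 (I +/- sy)."""
    c, s = math.cos(theta), math.sin(theta)
    p = c / (c + s)
    q = 1 - p
    return [(p / 2, p / 2, 0.0), (p / 2, -p / 2, 0.0),
            (q / 2, 0.0, q / 2), (q / 2, 0.0, -q / 2)]

def measurement_term(theta, povm):
    """4*(cos^2/B + sin^2/Gamma): total error plus 2k, valid when Delta = 0."""
    B, G, _ = moments(povm)
    return 4 * (math.cos(theta) ** 2 / B + math.sin(theta) ** 2 / G)

theta = 0.3  # sample angle in (0, pi/2)
for name, povm in [("three", three_outcome(theta)), ("four", four_outcome(theta))]:
    gap = measurement_term(theta, povm) - 2 * (1 + math.sin(2 * theta))
    print(name, moments(povm), gap)  # gap at rounding-error level
```

The angle $\theta = 0$ is excluded from the check because there $\Gamma = 0$ for the three-outcome POVM, consistently with the fact that only $\sigma_x$ has to be estimated in that case.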



We recall that the results of the qubit case from Eq. (46) to Eq. (59)
are obtained under the assumption of an isotropic ensemble,
$\rho_{\mathcal E} = \frac{I}{2}$. In fact, we want to stress that even
in the qubit case, whenever the state $\rho_{\mathcal E}$ corresponding
to the prior ensemble does not fully lie in the space
$\mathcal S_{\sigma_x,\sigma_y}$, it is not proved that the optimal
POVM is $\sigma_x,\sigma_y$-informationally complete.



IV. CONCLUSIONS 



In this paper we have introduced the concept of AB-informationally
complete measurements, within the context of quantum indirect
estimation theory. Compared with a customary infocomplete measurement,
the AB-infocomplete one in principle allows a less noisy joint
estimation of all the moments of two noncompatible observables A and B.
The concept of AB-infocompleteness can also be easily extended to more
than two observables, but we have not analyzed such a generalization.
We solved the case of qubits, showing that a
$\sigma_x\sigma_y$-infocomplete measurement is less noisy than any
infocomplete one. The relation between the concept of
AB-infocompleteness and the notion of joint measurement of observables
A and B has also been discussed. The relation between minimality and
optimality of AB-infocomplete measurements remains an open problem.
[25] The proof that
$\Pi_{AB}(\Lambda\pi^{-1}\Lambda^\dagger)^{-1}\Pi_{AB} \geq (\Pi_{AB}\Lambda\pi^{-1}\Lambda^\dagger\Pi_{AB})^{-1}$
is the following. Consider two positive invertible operators X and Y
such that $X \geq Y$. Then we have $Y^{-1/2}XY^{-1/2} \geq I$, and
consequently $(Y^{-1/2}XY^{-1/2})^{-1} = Y^{1/2}X^{-1}Y^{1/2} \leq I$,
and finally $X^{-1} \leq Y^{-1}$. Now one can prove [24] that for a
general invertible positive operator X and any projection $\Pi$ one has

$$\Pi X^{-1}\Pi = \left(\Pi X\Pi - \Pi X[(I - \Pi)X(I - \Pi)]^{-1}X\Pi\right)^{-1},$$

where the inverse $[(I - \Pi)X(I - \Pi)]^{-1}$ is on the support of
$I - \Pi$. Now, clearly

$$\Pi X\Pi - \Pi X[(I - \Pi)X(I - \Pi)]^{-1}X\Pi \leq \Pi X\Pi,$$

and consequently $\Pi X^{-1}\Pi \geq (\Pi X\Pi)^{-1}$.
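[26] Notice that it is irrelevant to add a trace to A and B, since this
can be done by adding an operator proportional to the identity, e.g.
$X' = X + kI$, and the minimum error in the estimation of
$\langle X'\rangle$ would be

$$\delta^2_{\mathcal E}(X') = \sum_{i=1}^N (c_i + k)^2\,\mathrm{Tr}[\rho_{\mathcal E}P_i] - \overline{(\langle X\rangle + k)^2}_{\mathcal E} = \delta^2_{\mathcal E}(X),$$

where $c_i$ denotes the optimal processing function for X, since the
processing function of the identity for the optimal processing is equal
to 1.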



